fix: keep migrated agents running and harden migration paths #1

Open
simonovic86 wants to merge 1 commit into main from fix/migration-runtime-hardening

simonovic86 commented Feb 17, 2026

Summary

  • start a managed tick loop for agents accepted via migration so they continue execution on the target
  • protect migration service active-agent bookkeeping with mutexes to avoid concurrent map access races
  • harden incoming migration handling by using secure temp WASM files and rejecting empty agent IDs
  • harden checkpoint storage paths by validating agent_id before building filesystem paths

Why

This addresses three concrete defects:

  1. migrated agents were marked started but never ticked on the target
  2. activeAgents map was unsafely accessed from concurrent stream handlers
  3. unsanitized agent IDs could influence filesystem paths

Validation

  • PATH="$HOME/go/bin:$PATH" make check
  • pre-commit hook passed (fmt-check, go vet, golangci-lint, go test ./...)

chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89e30cb9b1

Comment on lines +298 to +302
if err := s.startManagedAgentLoop(pkg.AgentID, instance); err != nil {
	s.logger.Error("Failed to start migrated agent", "error", err)
	_ = instance.Close(ctx)
	s.sendStartConfirmation(stream, pkg.AgentID, false, err.Error())
	return

[P2] Clean up checkpoint when managed agent registration fails

If startManagedAgentLoop fails here (for example because registerManagedAgent rejects a duplicate agentID), the handler returns an error but leaves the just-written checkpoint file in place. In this path, the existing local agent keeps running while its persisted checkpoint has been overwritten by the incoming transfer, so a later restart/load can resume from the wrong state even though migration was rejected.
